Identifying Antigenic Switching by Clonal Cell Barcoding and Nanopore Sequencing in Trypanosoma brucei

Many organisms alternate the expression of genes from large gene sets or gene families to adapt to environmental cues or immune pressure. The single-celled protozoan pathogen Trypanosoma brucei spp. periodically changes its homogeneous surface coat of variant surface glycoproteins (VSGs) to evade host antibodies during infection. This pathogen expresses one out of ~2,500 VSG genes at a time from telomeric expression sites (ESs) and periodically changes their expression by transcriptional switching or recombination. Attempts to track VSG switching have previously relied on genetic modifications of ES sequences with drug-selectable markers or genes encoding fluorescent proteins. However, genetic modifications of the ESs can interfere with the binding of proteins that control VSG transcription and/or recombination, thus affecting VSG expression and switching. Other approaches include Illumina sequencing of the VSG repertoire, which shows VSGs expressed in the population rather than cell switching; in addition, the short Illumina reads often cannot distinguish among the large set of VSG genes. Here, we describe a methodology to study antigenic switching without modifications of the ES sequences. Our protocol enables the detection of VSG switching at nucleotide resolution using multiplexed clonal cell barcoding to track cells and nanopore sequencing to identify cell-specific VSG expression. We also developed a computational pipeline that takes DNA sequences and outputs VSGs expressed by cell clones. This protocol can be adapted to study clonal cell expression of large gene families in prokaryotes or eukaryotes.

Key features
• This protocol enables the analysis of variant surface glycoprotein (VSG) switching in T. brucei without modifying the expression site sequences.
• It uses a streamlined computational pipeline that takes fastq DNA sequences and outputs the VSG genes expressed by each parasite clone.
• The protocol leverages the long-read sequencing capacity of Oxford nanopore sequencing technology, which enables accurate identification of the expressed VSGs.
• The protocol requires approximately eight to nine days to complete.

This protocol is used in: eLife (2023), DOI: 10.7554/eLife.89331.4

Background
Trypanosoma brucei is a single-celled protozoan parasite that causes African trypanosomiasis and evades the host antibody response by changing its surface coat by antigenic variation (Cestari and Stuart, 2018). T. brucei expresses a single variant surface glycoprotein (VSG) gene from one of the 20 telomeric expression sites (ESs) and periodically switches to a different VSG via transcriptional switching between ESs or by VSG gene recombination. T. brucei has an extensive repertoire of over 2,500 VSG genes and pseudogenes located in telomeric and subtelomeric regions of large chromosomes. VSG genes are also found on dozens of mini chromosomes often used for VSG gene recombination. VSG genes are approximately 2 kb in length with conserved C-terminus sequences. VSG recombination can occur by gene or segmental gene conversion, resulting in new mosaic VSG sequences. The mechanisms controlling VSG monogenic expression and switching likely entail multiple processes, including controlling VSG repression and expression via proteins associated with telomeric ESs. Several proteins associate with the telomeric repeats or ESs to regulate VSG gene expression and/or switching, such as the repressor activator protein 1 (RAP1) (Touray et al., 2023), phosphatidylinositol phosphate 5-phosphatase (PIP5Pase) (Cestari et al., 2019), telomeric repeat-binding factor (Jehi et al., 2014), VSG exclusion protein 2 (Faria et al., 2019), and ES body 1 protein (López-Escobar et al., 2022); for a review on additional proteins controlling VSG expression and switching, see Cestari and Stuart (2018).

Approaches used to study VSG switching rely on genetic modifications that disrupt the ES DNA sequences by incorporating drug-selectable markers or fluorescent proteins downstream of the promoter sequence and upstream of the VSG gene (Rudenko, 1998; Ulbert et al., 2002; Aitcheson et al., 2005) or by adding exogenous endonuclease sites resulting in DNA breaks (Boothroyd et al., 2009; Glover et al., 2013). However, the ES modifications might disrupt protein binding sites and thus affect VSG switching rates; as an example, RAP1 binds to 70 bp and telomeric repeats flanking ES VSG genes and represses their transcription, and disruption of its binding dramatically increases VSG switching rates (Touray et al., 2023). In addition, the genetic modifications of ESs are laborious and restrict the use of drug-selectable markers available for other genetic changes, such as gene knockout or expression of mutant variants in the cell. Other studies used Illumina RNA-seq to investigate VSG expression at a population level (Mugnier et al., 2015). Although this approach helps to identify expressed VSG genes, many short reads fail to align uniquely to the genome and therefore cannot unambiguously distinguish the expressed VSG genes within the extensive repertoire of VSG genes/pseudogenes. Moreover, it identifies the VSG genes expressed in the population, not the individual cells that switched. Hence, we sought to develop an approach to track cell-specific antigenic switching without genetically modifying ES sequences in an adaptable high- or medium-throughput fashion. We devised a method to detect VSG switching at nucleotide resolution using clonal cell barcoding and nanopore sequencing. We combine DNA barcoding to

Solutions

(Total volume: 900 mL) Dissolve the IMDM and salts in 500 mL of water and stir to mix thoroughly. Add 10 mL of hypoxanthine, 10 mL of sodium pyruvate, and 14 µL of 2-mercaptoethanol. Adjust water volume to 900 mL. Filter the medium using a 0.22 µm filter system of 250 mL. Add 100 units of sterile penicillin and 100 µg/mL sterile streptomycin (optional). Complete the medium by adding 10% heat-inactivated FBS. Store at 4 °C. The shelf life of the medium is four months.

(Total volume: 500 mL) Dissolve the salts in 300 mL of water. Add D-glucose and stir thoroughly to dissolve it. Weigh the volume of glycerol equivalent to 75 g in a separate container and add it to the solution. Adjust the volume of the solution to 500 mL and stir thoroughly. Filter using a 0.22 µm filter. Store it at 4 °C. The solution is stable for six months.

(Total volume: 500 mL) Prepare a stock solution of 0.5 M EDTA in a separate tube by dissolving 18.6 g of EDTA disodium salt in 80 mL of water. Adjust the pH to 8.0 with 10 M NaOH solution. Adjust the volume to 100 mL. Dissolve the Tris base in 300 mL of water and add 28.6 mL of 17.4 M glacial acetic acid and 50 mL of 0.5 M EDTA. Adjust the volume of the solution to 500 mL with water. Autoclave the solution. Store it at room temperature (RT). The solution is stable for at least six months.

A. Parasite treatment and cloning
We recommend determining cell treatment conditions before starting this protocol. The conditions used in this protocol were optimized for T. brucei bloodstream forms of the 427 strain or conditional null (CN) cells derived from the single-marker 427 strain (Cestari and Stuart, 2015). The treatment described here is the knockdown for 24 h of the T. brucei gene encoding PIP5Pase, which results in high rates of VSG switching (Touray et al., 2023). The PIP5Pase CN cell line is grown in G418 and phleomycin to maintain the selection of the tetracycline-inducible system. Tetracycline is added to induce expression of the PIP5Pase gene under the control of a procyclin promoter and a tetracycline operator.
1. Grow 5 mL of T. brucei cells seeded at 1.0 × 10^4 cells/mL in HMI-9 medium supplemented with 2 µg/mL G418, 2.5 µg/mL phleomycin, and 500 ng/mL tetracycline in a 37 °C incubator with 5% CO2 for 24 h or until it reaches mid-log growth (~1.0 × 10^6 cells/mL). Throughout this protocol, cell growth will be as described above unless otherwise stated. Cells' doubling time should be approximately 5.5-6 h and viability approximately 90%-95%. Avoid overgrowing cell culture (>1.5 × 10^6 cells/mL) because it will affect cell viability.
2. Transfer the 5 mL cell culture to a 15 mL Falcon tube and centrifuge at 3,500× g for 5 min at RT. Discard the supernatant.
3. Resuspend the pellet in 10 mL of PBS-G pre-warmed at 37 °C and then centrifuge the cells as in step A2. Discard the supernatant.
4. Repeat step A3 three times to ensure complete removal of tetracycline.
5. Split the cells into two 5 mL cell culture flasks (treated and non-treated groups), seeding each culture at 1.0 × 10^4 cells/mL in HMI-9 medium with 2 µg/mL G418 and 2.5 µg/mL phleomycin.
6. Add 500 ng/mL tetracycline to the non-treated flask (Tet +, control) and no tetracycline to the treatment flask (Tet -, knockdown) and grow the cells for 24 h.
7. Add 500 ng/mL tetracycline to the treatment flask (Tet -, knockdown). No additional tetracycline is required for the non-treated flask (Tet +, control). Quantify the cell concentration and viability of both the treatment and control groups by mixing 10 µL of cell culture and 10 µL of 0.4% Trypan blue staining. Add 10 µL to the CellDrop™ cell counter to obtain viability and cell concentration. Cell concentration should be approximately 1.0 × 10^5 cells/mL, and viability should be >90%.
8. Add 9 mL of HMI-9 medium supplemented with 500 ng/mL tetracycline to a 50 mL Falcon tube. Repeat the procedure to have three flasks for the treatment group and three for the non-treatment group.
13. Place the EZ-10 96-well plate into a new deep-well storage plate (provided in the kit), add 30 µL of RNase-free water (supplied in the kit), and then incubate the plate at RT for 5 min.
14. Centrifuge the plate at 5,000× g for 1 min at RT to elute the RNA solution.
15. Quantify the recovered RNA by measuring 1 µL of the RNA solution at 260 nm from approximately 10 random wells using a NanoDrop. This will help estimate the isolated RNA concentrations.
16. Tightly seal the plate with an adhesive cover and keep it on ice (or at -80 °C) until the RNA samples are ready for cDNA synthesis.
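As a rough aid for step 15, the absorbance readings can be converted to approximate RNA concentrations. The sketch below assumes the common spectrophotometric convention that an A260 of 1.0 at a 10 mm path length corresponds to about 40 ng/µL of RNA; the function names and the example reading are illustrative and are not part of the kit or the NanoDrop software.

```python
import math

# Assumed conversion factor: A260 of 1.0 (10 mm path) ~ 40 ng/µL of RNA.
RNA_NG_PER_UL_PER_A260 = 40.0


def rna_concentration(a260, dilution_factor=1.0):
    """Approximate RNA concentration (ng/µL) from a 10 mm-normalized A260 reading."""
    if a260 < 0:
        raise ValueError("A260 cannot be negative")
    return a260 * RNA_NG_PER_UL_PER_A260 * dilution_factor


def rna_input_mass(a260, volume_ul=5.0, dilution_factor=1.0):
    """RNA mass (ng) in the volume carried into cDNA synthesis (5 µL in this protocol)."""
    return rna_concentration(a260, dilution_factor) * volume_ul


example_a260 = 0.05  # hypothetical reading from one well
concentration = rna_concentration(example_a260)
mass = rna_input_mass(example_a260)
print(f"{concentration:.1f} ng/µL, {mass:.1f} ng in 5 µL")
```

Comparing the estimated mass with the 5-20 ng of total RNA per clone reported for this protocol gives a quick indication of whether a well yielded enough material.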

C. VSG-enriched cDNA synthesis and barcoding
This step requires a combination of primers (Figure 1) to barcode the cDNAs of each clonal cell population with a unique eight-nucleotide sequence for their identification during sequencing analysis. It will also provide an adapter sequence for DNA sequencing library preparation. The forward Ad-SL (5′-TTTCTGTTGGTGCTGATATTGCacagtttctgtactatattg-3′) primer is universal and includes an Oxford nanopore adaptor sequence (capital letters) followed by a sequence (small letters) that hybridizes to the mRNA splice leader sequence, a 39-nt sequence added to the 5′ end of all trypanosome mRNAs. The reverse Ad-3endVSG primer (5′-TACTTGCCTGTCGCTCTATCTTCXXXXXXXXgtgttaaaatatatc-3′) contains an Oxford nanopore adaptor sequence (capital letters) followed by an eight-nucleotide-long variable sequence (barcode) unique to each clone and a sequence pairing with the conserved 3′-end of VSG mRNAs, which encodes the C-terminus of VSG proteins (Mugnier et al., 2015). See Supplementary information for the complete primer list. We recommend preparing a working primer solution containing a mix of both primers at 10 µM in a 96-well plate.
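The primer design described above can be illustrated with a minimal Python sketch that fills the XXXXXXXX placeholder of the Ad-3endVSG primer with a clone barcode and reports how different the barcodes are from one another. The primer sequences are taken from the text; the helper names are hypothetical, and apart from CCATGCAT the example barcodes are made up (the real barcode set is in the Supplementary information).

```python
from itertools import combinations

AD_SL = "TTTCTGTTGGTGCTGATATTGCacagtttctgtactatattg"
AD_3END_VSG_TEMPLATE = "TACTTGCCTGTCGCTCTATCTTCXXXXXXXXgtgttaaaatatatc"
PLACEHOLDER = "X" * 8


def make_reverse_primer(barcode, template=AD_3END_VSG_TEMPLATE):
    """Return the Ad-3endVSG primer carrying the given 8-nt barcode."""
    barcode = barcode.upper()
    if len(barcode) != 8 or set(barcode) - set("ACGT"):
        raise ValueError(f"barcode must be 8 nt of A/C/G/T: {barcode!r}")
    return template.replace(PLACEHOLDER, barcode)


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance found between any two barcodes."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


# Hypothetical barcodes for three wells (only CCATGCAT appears in the text).
example_barcodes = ["CCATGCAT", "GATTACAG", "TTGACCGA"]
primers = {bc: make_reverse_primer(bc) for bc in example_barcodes}
for bc, primer in primers.items():
    print(bc, primer)
print("minimum pairwise distance:", min_pairwise_distance(example_barcodes))
```

A larger minimum distance between barcodes makes it less likely that a sequencing error converts one clone's barcode into another's.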
1. Take 4 µL of primer mix from the working primer solution plate and transfer it to each well of a new 96-well PCR plate using a multichannel pipette.
2. Add 1 µL of 10 mM dNTPs mix to each well of the same 96-well PCR plate.
3. Thaw the RNA samples (from step B16) on ice if frozen at -80 °C and add 5 µL of the samples, keeping the same orientation in the 96-well PCR plate as the original RNA plate. Mix the solutions by gently pipetting up and down five times and spin down the plate briefly at 1,000× g for 30 s at 4 °C.
4. Incubate the plate at 65 °C for 5 min in a thermocycler and then transfer the plate immediately to ice.
5. Prepare cDNA synthesis master mix by adding 2 µL of MuLV 10× Buffer, 5 U of M-MuLV Reverse Transcriptase, and 8 U of RNase inhibitor (supplied in the MuLV reverse transcriptase kit) and adjust the reaction volume to 10 µL per reaction using nuclease-free water.
6. Aliquot 10 µL of the cDNA synthesis master mix into each well of the 96-well plate containing the RNA, dNTPs, and primer mix.
7. Thoroughly seal the plate using an adhesive plate sealing film, spin down at 1,000× g for 30 s, and incubate at 42 °C for 2 h and then 65 °C for 20 min in a thermocycler.
8. Store the synthesized cDNA samples at -80 °C or proceed to library preparation.

The sequences in black correspond to ~20 bp nanopore barcode adapter sequences, and the sequences in pink and green correspond to the forward Ad-SL and reverse Ad-3endVSG primers, respectively. The Ad-SL pairs to the mRNA splice leader sequence, while the reverse Ad-3endVSG primer pairs to the conserved 3′-end of VSG mRNAs. The 8-nucleotide-long variable sequence (clonal barcode) unique to each clone is shown in orange and depicted here as X. A complete list of primers is available in Supplementary Information. PBS-G: phosphate-buffered saline (PBS)-glucose; ONT: Oxford Nanopore Technology; PIP5Pase CN: phosphatidylinositol 5-phosphatase (PIP5Pase) conditional null.

A. Computational analysis of the sequenced library
The data analysis described here was performed using a Linux operating system (Ubuntu). The computational resources required will vary depending on the data available for analysis. The Oxford nanopore sequencer will generate fast5 files, which are basecalled to fastq files using the Guppy tool integrated in the MinKNOW software (https://nanoporetech.com/). The fastq files are the input dataset in the analysis shown here. We developed a computational pipeline for the sequencing analysis and detection of VSG switching (Figure 2). The pipeline is run via the vsg-barseq.sh script. A fastq file with 100,906 reads was analyzed using 10 threads and 4 GB of memory and completed in 5 min.

The pipeline takes the DNA sequences generated by the nanopore sequencing in fastq format and splits them into subfiles according to the cell barcodes (eight-mers) used for DNA-seq library preparation. The output fastq files are named by their barcodes, e.g., CCATGCAT.fastq. A split summary file (.txt) is generated and shows the number of reads per barcode file, the total count of reads analyzed, and the total number of reads containing barcodes (see Note 5). Sequences from each file are then mapped to the organism genome (here, T. brucei 427 strain) using minimap2, outputting .sam files. The alignments are then filtered using Samtools to remove supplementary alignments and to keep alignments with a mapping quality score (mapQ) ≥ 20. The resultant .sam files are sorted and indexed with Samtools, resulting in sorted.bam and bam.bai files, respectively. The alignments are counted with featureCounts (package Subread). The top mapped reads, which correspond to the expressed VSG, are selected and compiled in a single tab-delimited file (topmapped.txt) while keeping the original output from featureCounts, which serves for analysis of other genes identified during alignment. The analysis of other identified genes is part of the quality control process. It typically shows genes with low counts, reflecting low background noise from library preparation resulting primarily from sequences derived from splice-leader cDNA synthesis (Figure 3E).

The vsg-barseq.sh script is open source and available on GitHub (https://github.com/cestari-lab/VSG-Bar-seq). The script takes six arguments: 1) Directory of all files (output folders will be created in this directory)

Note that the T. brucei genome (.fasta) and features (.gff, general feature format) can be downloaded from TriTrypDB (https://tritrypdb.org/). TriTrypDB does not provide .gtf files, but .gff files can be used to generate .gtf. We recommend using the gffread tool (https://github.com/gpertea/gffread), also available via Galaxy tools (https://usegalaxy.org/). The command below indicates how to generate a gtf file from a gff file after installing gffread tools.
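Separately from the gffread conversion, the barcode-splitting step and its split summary can be illustrated with a short Python sketch. This is not the vsg-barseq.sh implementation: it assumes a plain substring search for each eight-mer (or its reverse complement) anywhere in the read, and the function names and example reads are hypothetical. With this matching rule a read containing two barcodes is assigned to both, which is the situation Note 5 describes.

```python
import io
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fastq(handle):
    """Yield (header, sequence, separator, quality) tuples from a 4-line fastq."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def split_by_barcode(records, barcodes):
    """Bin reads by barcode and build a summary like the split summary file."""
    bins = defaultdict(list)
    total = 0
    for record in records:
        total += 1
        seq = record[1].upper()
        for barcode in barcodes:
            # Assumed rule: barcode may occur in either orientation.
            if barcode in seq or reverse_complement(barcode) in seq:
                bins[barcode].append(record)
    per_barcode = {b: len(bins[b]) for b in barcodes}
    summary = {
        "total_reads": total,
        "reads_per_barcode": per_barcode,
        "barcode_assignments": sum(per_barcode.values()),
    }
    return bins, summary


# Three made-up reads: one per barcode and one without any barcode.
example_reads = [
    ("@read1", "ACGT" + "CCATGCAT" + "TTTT"),
    ("@read2", "AAAA" + reverse_complement("GATTACAG") + "GGGG"),
    ("@read3", "ACACACACACACACAC"),
]
fastq_text = "".join(
    f"{name}\n{seq}\n+\n{'I' * len(seq)}\n" for name, seq in example_reads
)
bins, summary = split_by_barcode(
    read_fastq(io.StringIO(fastq_text)), ["CCATGCAT", "GATTACAG"]
)
multi_assigned = summary["barcode_assignments"] > summary["total_reads"]
print(summary, "reads in more than one file:", multi_assigned)
```

In practice each bin would be written to a file named after its barcode (e.g., CCATGCAT.fastq) before mapping with minimap2.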

B. Anticipated results
The analysis of parasite VSG switching starts with treating the cultures to induce the switching of the VSG gene. The treatment conditions and outcomes might differ for different cell lines, so they may need to be optimized. Here, the treatment was the temporary knockdown (24 h) of the PIP5Pase gene, which results in high rates of VSG switching (Touray et al., 2023). VSG switching occurs by alternating transcription between ESs or recombining VSG genes within ESs (Figure 3A). The parasite culture was diluted to obtain clones in approximately 30% of the wells of a 96-well plate. We recommend optimizing the dilution of the cells after the treatment and checking cell viability before the VSG-BarSeq experiment. After cDNA synthesis and library amplification with ONT-barcoding primers, we recommend analysis of the product in agarose gel. We often obtain a PCR fragment smear ranging from 200 to 2,000 bp (Figure 3B). After library preparation, we usually obtain a library concentration ranging between 10 and 40 ng/µL with 260/280 and 260/230 ratios of ~1.8 and 2.0, respectively. Library concentrations below 5 ng/µL with considerable deviations from the above 260/280 and 260/230 ratios usually result in inferior quality and low-throughput sequencing. Analysis of the sequencing using the vsg-barseq.sh script typically results in 70%-99% genome mapping.

Figure 3C and 3D show the results of VSG switching in T. brucei after the temporary knockdown of PIP5Pase. Analysis of 117 clones from the control (no knockdown) showed that none of the cells switched VSGs (Figure 3C). However, after 24 h temporary knockdown, there were 93 switchers out of 94 clones analyzed. There was a preference for the cells to switch to VSG8 in the BES12, suggesting transcriptional switching. Moreover, switching to VSGs from subtelomeric regions (Chr2_5A and Chr9_3A) was also detected, indicating switching by recombination (Figure 3C and D). The analysis does not require a significant amount of RNA or sequencing throughput since the experimental setup relies on selecting VSG mRNAs, which are highly abundant (Cestari and Stuart, 2015), and on the analysis of clones rather than heterogeneous cell populations. Analysis of the expressed VSG (signal) compared to other genes (noise) showed a high signal-to-noise ratio, and increasing sequencing depth improved the signal without significantly increasing noise levels (Figure 3E). Although DNA sequencing can be costly, the amount of total RNA for cDNA synthesis per clone in this protocol is minimal (5-20 ng), and the sequencing depth required for analysis can be obtained from Oxford nanopore Flongle flow cells, which typically produce 500-1,000 megabases of DNA sequencing (i.e., ~100,000 reads per group), thus reducing experimental costs.

The results show the method's utility in identifying VSG switching events using multiplexed clonal cell barcodes. We anticipate that, with minimal modifications of the primers used for the target sequences, the approach can be applied to other gene families or other cell types, including var genes in Plasmodium sp., variant surface proteins in Giardia sp., as well as mucins and mucin-associated proteins in Trypanosoma cruzi. Analogously, the approach can be extended to other organisms or cell lines, including mammalian cells, e.g., T-cell or B-cell repertoire analysis.

Published: Dec 20, 2023
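The way switchers are called from per-clone counts can be sketched as follows. The example assumes that each clone's gene counts are already available as a dictionary and that the parental clone expressed VSG2 (as in Figure 3D); the counts, gene labels other than VSG2 and VSG8, and the signal-to-noise definition (top gene count divided by the sum of all other counts) are illustrative assumptions rather than the published calculation.

```python
def top_mapped(counts):
    """Return (gene, count, signal_to_noise) for the most counted gene of a clone."""
    if not counts:
        raise ValueError("no counts for this clone")
    gene = max(counts, key=counts.get)
    signal = counts[gene]
    noise = sum(counts.values()) - signal
    ratio = signal / noise if noise else float("inf")
    return gene, signal, ratio


def call_switchers(clone_counts, parental_vsg):
    """Label each clone as a switcher if its top gene differs from the parental VSG."""
    calls = {}
    for barcode, counts in clone_counts.items():
        gene, signal, ratio = top_mapped(counts)
        calls[barcode] = {
            "expressed": gene,
            "reads": signal,
            "signal_to_noise": ratio,
            "switched": gene != parental_vsg,
        }
    return calls


# Made-up counts for two barcoded clones.
clone_counts = {
    "CCATGCAT": {"VSG2": 950, "geneA": 12},
    "GATTACAG": {"VSG8": 800, "VSG2": 5, "geneB": 20},
}
calls = call_switchers(clone_counts, parental_vsg="VSG2")
n_switchers = sum(call["switched"] for call in calls.values())
for barcode, call in calls.items():
    print(barcode, call)
print(f"{n_switchers} switcher(s) out of {len(calls)} clones")
```

Clones whose top gene has a low ratio against the remaining counts may deserve a closer look, for example to check whether two barcoded populations were mixed in one well.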

General notes
1. If more than 40% of the wells in a 96-well plate are positive, the parasites might not be clonal. We recommend optimizing the dilution of the parasites to obtain approximately a third of the wells from a 96-well plate containing growing parasites.
2. We recommend performing a quality control PCR prior to cDNA amplification. Use the same conditions indicated in step D3 but for 35 cycles. Then, analyze 30 µL of the amplicons on a 1% agarose/TAE electrophoresis gel with 5 µL of Ecostain at 85 V for 45 min and visualize using a gel imager. A smear migrating at 200-2,000 bp should be expected in the agarose gel (Figure 3B).
3. We recommend the volume of pooled cDNA sample added to the PCR reaction to be less than one-tenth of the total PCR reaction volume.
4. Reverse transcriptase enzymes are known to inhibit PCR, particularly at low template concentrations (Chandler et al., 1998). Therefore, adding more cDNA to the PCR reactions usually results in no or very little amplification.
5. The split summary file helps to identify the splitting of reads into multiple barcode fastq files. If the number of reads with barcodes is larger than the total number of reads, some reads have been included in more than one barcode fastq file. In that case, primers with barcodes longer than eight-mers could be used.
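The reasoning behind Note 1 can be illustrated with a simple Poisson model of limiting dilution. The sketch below assumes that cells are distributed among wells at random and that every seeded cell grows out; real plating efficiencies will differ, and the function names are hypothetical.

```python
import math


def mean_cells_per_well(fraction_positive):
    """Poisson mean (cells per well) implied by the fraction of wells with growth."""
    if not 0 < fraction_positive < 1:
        raise ValueError("fraction_positive must be between 0 and 1 (exclusive)")
    return -math.log(1.0 - fraction_positive)


def fraction_clonal(fraction_positive):
    """Expected fraction of positive wells that were seeded with exactly one cell."""
    lam = mean_cells_per_well(fraction_positive)
    p_exactly_one = lam * math.exp(-lam)
    return p_exactly_one / fraction_positive


for positive in (0.2, 0.3, 0.4, 0.6):
    print(
        f"{positive:.0%} positive wells -> "
        f"{mean_cells_per_well(positive):.2f} cells/well on average, "
        f"{fraction_clonal(positive):.0%} of positive wells expected to be clonal"
    )
```

Under these assumptions, the expected share of single-cell wells falls as more wells show growth, which is why lower plate occupancy is preferred.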
80% (v/v) ethanol: Transfer 80 mL of 100% ethyl alcohol solution to a 100 mL graduated cylinder. Then, add 20 mL of Nanopure Milli-Q water. Store at 4 °C. The solution is stable for one month.

Figure 1. Schematic representation of clonal cell barcode and nanopore sequencing protocol. (A) Diagram of the clonal cell barcoding and nanopore sequencing workflow showing the parasite treatment and cloning (Step 1), RNA isolation from the individual clones, cDNA synthesis, and clonal barcoding (Step 2), and Oxford nanopore technology (ONT) library preparation, sequencing, and data analysis using VSG-BarSeq script (Step 3). (B) Scheme of the forward Adaptor Splice Leader (Ad-SL) primer and the reverse Adaptor-barcode-3′-end VSG primer (Ad-3endVSG) used for cDNA synthesis and clonal cell barcoding. (C) Diagram describing the cDNA synthesis, clonal barcoding, ONT library barcoding, and PCR

Figure 2. Flowchart of computational analysis using VSG-BarSeq. Multiplexed reads from clonal VSG-seq are split into clone-specific reads based on an eight-mer barcode. Reads are aligned to the genome using minimap2 and filtered with Samtools to remove supplementary and secondary alignments and keep alignments with mapQ ≥ 10. Then, the pipeline counts the alignments per gene and reports the top alignment per file, corresponding to the expressed variant surface glycoprotein (VSG) gene.

Figure 3. Analysis of variant surface glycoprotein (VSG) switching after PIP5Pase knockdown using VSG-BarSeq. (A) Diagram of bloodstream-form expression sites (BESs) and sub-telomeric regions containing VSG genes used for recombination. The pink arrowhead represents a BES promoter. VSG genes are transcribed from BESs only. (B) Thirty-five cycles of quality-control PCR amplification of pooled clonal VSG barcoded cDNAs. PCR was amplified with ONT barcode primers. NC: negative control, M: 5 kb DNA ladder. (C) VSG genes expressed by clones of T. brucei without PIP5Pase knockdown (Tet +) and after temporary knockdown for 24 h followed by PIP5Pase re-expression and cloning for 5-7 (Tet -/+). Annotated names and corresponding chromosomes or BES identify VSG genes. The number of clones analyzed is indicated in parentheses. (D) Read coverage plots of example clones from Tet + or Tet -/+ treatment groups. The diagram on the top right summarizes the experiment and results: all clones derived from an original clone expressing VSG2 (BES 1). Graphs show read coverage of expressed VSGs on the same scale. (E) Signal-to-noise ratios of two different VSG-BarSeq libraries differing in sequencing depth. Sequencing depth is shown above plot bars as the total reads by the mean read length. Mb: megabases.

Phosphate-buffered saline-glucose (PBS-G) (1,000 mL)
Dissolve salts in 900 mL of water. Add D-glucose and stir thoroughly to dissolve it. Adjust the pH to 7.0 and filter sterilize using a 0.22 µm filter. Store at 4 °C. This buffer is stable for three months at 4 °C.